Multiple Testing

Haky Im

2024-03-27

The Perils of Multiple Testing

The Perils of Multiple Testing (cont)

What happened? Is the conclusion sensible?


What is the probability of not rejecting the null when it is true?

  • What is the probability of not rejecting a true null at the 0.05 significance level when testing once?

    • \(1 - 0.05 = 0.95\)
  • What is the probability of not rejecting any true null at the 0.05 significance level when testing twice (independent tests)?

    • \((1 - 0.05)^2 = 0.95^2 = 0.9025\)
  • What is the probability of not rejecting any true null at the 0.05 significance level when testing \(100\) times (independent tests)?

    • \((1 - 0.05)^{100} = 0.95^{100} \approx 0.0059\)
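The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the tests are independent and that every null hypothesis is true; the function name `prob_no_false_positive` is made up for illustration.

```python
def prob_no_false_positive(alpha: float, m: int) -> float:
    """P(no rejection in m independent tests) when every null is true."""
    return (1.0 - alpha) ** m


for m in (1, 2, 100):
    p_none = prob_no_false_positive(0.05, m)
    # 1 - p_none is the chance of at least one false positive.
    print(f"m={m:>3}  P(no false positive)={p_none:.4f}  "
          f"P(at least one)={1 - p_none:.4f}")
```

With 100 tests, at least one false positive is almost guaranteed, which is the core of the multiple testing problem.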

How to solve multiple testing problem?

  • Bonferroni Correction

  • Significance level: \(\frac{0.05}{\text{number of tests}}\)

  • Pros and cons of using Bonferroni correction
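To make the correction concrete, the sketch below estimates how often at least one test is falsely rejected, with and without Bonferroni. It assumes all nulls are true and the tests are independent, so each p-value can be drawn as a uniform random number. The function names and the simulation sizes are illustrative choices, not part of the slides.

```python
import random


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


def family_wise_error_rate(threshold, n_tests, n_families=2000, seed=0):
    """Estimate P(at least one rejection) when all nulls are true.

    Assumption: under a true null a valid p-value is Uniform(0, 1),
    so each p-value is drawn with rng.random().
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_families):
        if any(rng.random() < threshold for _ in range(n_tests)):
            hits += 1
    return hits / n_families


n_tests = 100
uncorrected = family_wise_error_rate(0.05, n_tests)
corrected = family_wise_error_rate(bonferroni_threshold(0.05, n_tests), n_tests)
print(f"Error rate without correction: {uncorrected:.3f}")
print(f"Error rate with Bonferroni:    {corrected:.3f}")
```

The corrected rate should land near 0.05 or below. The price is a much stricter per-test threshold, which reduces power when some alternatives are true.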

Genome-Wide Significance Level

  • Typical level: \(5 \times 10^{-8}\)
  • Relation to number of tests in GWAS.
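As a rough worked example, assume about one million independent common-variant tests across the genome (a commonly cited approximation, not a figure stated in these slides). Applying the Bonferroni rule to an overall level of 0.05 then gives the typical threshold:

\[
\frac{0.05}{10^{6}} = 5 \times 10^{-8}
\]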

Distribution of P-values Under the Null

Simulations of P-values

  • Under null and alternative hypotheses.
  • Generating distribution of p-values through simulations.
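A simulation along these lines can be sketched in Python with the standard library. The setup is an assumption made for illustration: a two-sided z-test of mean zero with known standard deviation 1, 20 observations per test, and a true mean of 0.5 under the alternative. All function names are hypothetical.

```python
import math
import random


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0 with known standard deviation."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(true_mean, n_sims=5000, n_obs=20, seed=1):
    """Draw n_sims samples and return one p-value per sample."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n_obs)])
        for _ in range(n_sims)
    ]


def histogram(p_values, n_bins=10):
    """Count p-values in equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


null_p = simulate_p_values(true_mean=0.0)
alt_p = simulate_p_values(true_mean=0.5, seed=2)
print("null:       ", histogram(null_p))
print("alternative:", histogram(alt_p))
```

Under the null the bin counts should be roughly equal, reflecting a uniform distribution. Under the alternative the p-values pile up in the lowest bin.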

Common Approaches to Correct for Multiple Testing

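  • Bonferroni correction (a simple way to control the FWER)
  • False Discovery Rate (FDR) control: limits the expected proportion of false positives among the rejected hypotheses
  • Family-Wise Error Rate (FWER) control: limits the probability of making at least one false rejection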
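The slides do not name a specific FDR procedure. The sketch below uses the Benjamini-Hochberg step-up procedure as one common choice and compares it with Bonferroni on a small list of made-up p-values. The function name and the data are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    # Reject the hypotheses with the largest_k smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(p_values, fdr=0.05)
bonferroni = [p <= 0.05 / len(p_values) for p in p_values]
print("BH rejections:        ", sum(bh))
print("Bonferroni rejections:", sum(bonferroni))
```

On this toy input the FDR procedure rejects at least as many hypotheses as Bonferroni, which is the usual trade-off: more discoveries in exchange for tolerating a controlled fraction of false ones.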

Using qvalue Package

  • Calculating q-values and controlling FDR in R.
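The qvalue package itself is an R package. The block below is not that package but a simplified Python sketch of the idea behind it, following Storey and Tibshirani (2003). It fixes the tuning parameter at `lam = 0.5`, which is an assumption made for brevity; the package's own estimate of the null proportion may differ. The function name and the simulated p-values are hypothetical.

```python
import random


def storey_q_values(p_values, lam=0.5):
    """Return (pi0, q_values) using a fixed-lambda estimate of pi0."""
    m = len(p_values)
    # Null p-values are uniform, so p-values above lam come mostly from
    # true nulls; rescale their share to estimate the null proportion pi0.
    pi0 = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values are monotone in p.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[idx] / rank)
        q_values[idx] = running_min
    return pi0, q_values


rng = random.Random(3)
# Hypothetical mixture: 800 true nulls (uniform p-values) followed by
# 200 signals (p-values below 0.001).
p_values = [rng.random() for _ in range(800)]
p_values += [rng.random() * 0.001 for _ in range(200)]

pi0, q_values = storey_q_values(p_values)
n_discoveries = sum(q <= 0.05 for q in q_values)
print(f"estimated pi0: {pi0:.2f}")
print(f"tests with q <= 0.05: {n_discoveries}")
```

Calling a test significant when its q-value is at most 0.05 aims to keep the FDR among the discoveries near 5%.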

References

Storey, John D., and Robert Tibshirani. 2003. “Statistical Significance for Genomewide Studies.” Proceedings of the National Academy of Sciences 100 (16): 9440–45.